SciKnow.io Consulting
CAIS 2026 · Demo Paper · San Jose, CA · 26 May 2026

Complex Knowledge Curation using Agentic Ontological Notebook Memory

Gully A. Burns (gullyburns@gmail.com) · Paul Groth (p.t.groth@uva.nl)
University of Amsterdam
01 · Motivation
Agents make ontological commitments. Most hide them.

Every agent that persists memory must make a semantic commitment to a specific data schema, either implicitly or explicitly.

In today's agent stacks those commitments are buried in prompts, tool definitions, and heuristics — implicit, untestable, hard to refine.

We make them explicit. Rather than using flat vector or markdown memory, we showcase a typed knowledge graph in TypeDB. Domain concepts have names, types, and relations. The schema is the agent's 'world model'.

02 · Systems Architecture
Introducing Skillful Alhazen.

The system is based on standard agentic skills. Each bundles a TypeDB schema extension, Python scripts, and prompts (SKILL.md).

System architecture: Agent → Skills → TypeDB → Dashboard
03 · Skills = Domain Models
'Skills' are Curation Apps.

Curation — organizing disparate information into a well-formed knowledge representation.

Schema

The notebook schema is a type hierarchy with three high-level categories of entities.

COLLECTIONS · THINGS · RELATIONS: Standard KG representation of domain objects (corpora, papers, diseases, genes, jobs, people).
ARTIFACTS · FRAGMENTS: Information entities that provide evidence for the definition of KG elements.
NOTES: LLM analysis output saved to the knowledge graph.
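
The three categories can be pictured with a minimal Python sketch, using plain dataclasses as a stand-in for TypeDB types. All class and field names here are illustrative assumptions, not the actual notebook schema, and relations are reduced to object references.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    """A domain object in the knowledge graph (paper, gene, job, ...)."""
    id: str
    name: str


@dataclass
class Collection:
    """A named grouping of things, e.g. a corpus."""
    id: str
    name: str
    members: list[Thing] = field(default_factory=list)


@dataclass
class Artifact:
    """A captured information entity (PDF, web page, repo) about a thing."""
    id: str
    source_uri: str
    about: Thing


@dataclass
class Fragment:
    """A span of an artifact that serves as evidence."""
    id: str
    artifact: Artifact
    text: str


@dataclass
class Note:
    """LLM analysis output, linked to the evidence it rests on."""
    id: str
    subject: Thing
    content: str
    evidence: list[Fragment] = field(default_factory=list)


# Hypothetical example data: a note about a paper, backed by one fragment.
paper = Thing("t1", "Example paper")
corpus = Collection("c1", "Example corpus", members=[paper])
pdf = Artifact("a1", "file:///example.pdf", about=paper)
quote = Fragment("f1", pdf, "An illustrative sentence.")
note = Note("n1", paper, "Summary of the key claim.", evidence=[quote])
```

Following `note.evidence[0].artifact.about` leads back to the paper, which is the point of keeping notes, evidence, and domain objects as separate typed entities.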
Process

The curation process depends on the skill, but generally follows six ordered steps.

01 Goal Definition
Interview to define success criteria
02 Discovery
Identify candidates, confirm for investigation
03 Ingestion
Collect artifacts: repos, docs, pages, PDFs
04 Sensemaking
Structured note-taking, typed knowledge
05 Analysis
Visualizations, queries, comparisons
06 Reporting
Synthesis against success criteria
04 · Skills
Core skills ship with the framework.
Tech Recon

Goal-driven technology investigation.

Success criteria are first-class entities. Candidates flow through a typed pipeline — candidate → confirmed → ingested → analyzed — so progress is itself a queryable graph property.
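
A rough Python sketch of the typed pipeline, assuming each candidate carries exactly one stage at a time. The `Stage` enum, the helper names, and the stage assigned to each system are made up for illustration; in the real skill this state would live in TypeDB.

```python
from enum import Enum


class Stage(Enum):
    CANDIDATE = 1
    CONFIRMED = 2
    INGESTED = 3
    ANALYZED = 4


def advance(stage: Stage) -> Stage:
    """Move one step forward; the final stage is terminal."""
    if stage is Stage.ANALYZED:
        raise ValueError("already at the final stage")
    return Stage(stage.value + 1)


def progress(stages: dict[str, Stage]) -> dict[str, int]:
    """Count systems per stage, the kind of question a graph query would answer."""
    counts = {s.name.lower(): 0 for s in Stage}
    for stage in stages.values():
        counts[stage.name.lower()] += 1
    return counts


# Illustrative starting states only.
systems = {"MemGPT": Stage.CANDIDATE, "Mem0": Stage.CONFIRMED, "Zep": Stage.CANDIDATE}
systems["Zep"] = advance(systems["Zep"])
summary = progress(systems)
```

Because the stage is data rather than prose in a prompt, "how far along is this investigation?" becomes a count over the graph.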

Ask Claude "Map agentic memory systems against my success criteria — how do MemGPT, Mem0, Zep compare?"
Tech Recon dashboard
Curation Skill Builder

Create schema, scripts, prompts, and dashboard for a new skill.

Agentic Memory

TypeDB-backed ontological memory.

Introspects the live schema, composes TypeQL queries dynamically, and combines graph traversal with semantic search for three-stage retrieval: plan, execute, organize with provenance.
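
The plan, execute, organize stages might look roughly like the following Python sketch. It runs against a toy in-memory store rather than TypeDB, leaves out the semantic-search path, and every name and record in it is a hypothetical stand-in.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    text: str
    source: str  # provenance: which stored item the statement came from
    via: str     # which retrieval path produced it


# Toy stand-ins for the live schema and the stored notes (hypothetical data).
SCHEMA_TYPES = {"schema-evolution", "retrieval", "benchmark"}
STORE = [
    {"type": "schema-evolution", "text": "Gaps are recorded as notes.", "source": "note-1"},
    {"type": "benchmark", "text": "13 questions in 3 categories.", "source": "note-2"},
]


def plan(question: str) -> list[str]:
    """Stage 1: pick schema types whose name appears in the question."""
    q = question.lower().replace(" ", "-")
    return sorted(t for t in SCHEMA_TYPES if t in q)


def execute(types: list[str]) -> list[Hit]:
    """Stage 2: fetch matching records (a real system would run TypeQL here)."""
    return [
        Hit(r["text"], r["source"], via=f"type:{r['type']}")
        for r in STORE
        if r["type"] in types
    ]


def organize(hits: list[Hit]) -> dict[str, list[str]]:
    """Stage 3: group answers by provenance so each claim is traceable."""
    grouped: dict[str, list[str]] = {}
    for h in hits:
        grouped.setdefault(h.source, []).append(h.text)
    return grouped


answer = organize(execute(plan("What do I know about schema evolution approaches?")))
```

Keeping the stages as separate functions makes it possible to inspect the plan before any query runs.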

Ask Claude "What do I know about schema evolution approaches for TypeDB?"
Tech Recon dashboard
05 · Demo Skills
Domain skills show breadth.
Job Hunt

Personal career pipeline tracking.

Ask Claude "What skill gaps appear across my top 5 prospects?"
DisMech

Rare-disease mechanism curation.

Monarch Initiative DisMech data mapped to Alhazen Notebook via Claude-generated GLAV rules. 1,068 disorders with phenotypes (HPO), causal genes, treatments (MAXO), and PubMed evidence.

Ask Claude "Which diseases involve WNT/β-catenin but have no documented treatment?"
05 · Compare RAG vs TypeDB
RAG fails for structural queries.

DisMech benchmark — 13 questions, 3 categories, same corpus. Ground truth computed deterministically by scanning YAML (no LLM in scoring).

Question type | Example question | TypeDB | RAG
Pathway aggregation | "How many diseases involve WNT/β-catenin mechanisms?" | 0.75 | 0.00
Absence detection | "Which diseases have phenotypes but no genetic entries?" | 0.68 | 0.00
Global ranking | "Top 5 diseases by number of mechanisms?" | 0.77 | 0.03

RAG's failures are architectural — aggregation requires counting; absence requires evidence of non-existence; ranking requires ordering by structural property. Typed graphs have these primitives natively.
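
The three primitives can be illustrated on a toy structured store in Python. The records below are invented for illustration and are not DisMech data; the point is only that each question type is a full scan over typed fields rather than a lookup of the most similar passages.

```python
# Hypothetical toy records; names and values are made up for illustration.
diseases = {
    "disease-A": {"mechanisms": ["WNT/β-catenin", "inflammation"], "genes": ["G1"], "phenotypes": ["P1"]},
    "disease-B": {"mechanisms": ["WNT/β-catenin"], "genes": [], "phenotypes": ["P2", "P3"]},
    "disease-C": {"mechanisms": ["autophagy", "inflammation", "apoptosis"], "genes": ["G2"], "phenotypes": []},
}

# Aggregation: count every record that matches, not just a retrieved subset.
wnt_count = sum("WNT/β-catenin" in d["mechanisms"] for d in diseases.values())

# Absence: select records where a relation is missing.
no_genes = sorted(n for n, d in diseases.items() if d["phenotypes"] and not d["genes"])

# Ranking: order all records by a structural property.
ranked = sorted(diseases, key=lambda n: len(diseases[n]["mechanisms"]), reverse=True)
```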

06 · Ontology Design
Distinctions create clarity.

What is the best way to model a person in the system? As an author, a job-hunter, a company contact? The answer is all three — via formal role-bearing relations (UFO/OntoUML).

  • A person is rigid — strip the institution, they remain.
  • A role is anti-rigid — it comes and goes with context.
  • Role-specific data sits on the role, never polluting the person node.
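
A small Python sketch of the person/role split, assuming roles are separate objects that point at the person who plays them. The class names, role kinds, and data values are illustrative, not the system's actual types.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Rigid: identity does not depend on any context."""
    name: str


@dataclass
class Role:
    """Anti-rigid: exists only while the person participates in a context."""
    player: Person
    kind: str     # e.g. "author", "job-hunter", "company-contact"
    context: str  # the paper, search, or company the role belongs to
    data: dict[str, str] = field(default_factory=dict)


person = Person("Example Person")
roles = [
    Role(person, "author", "paper-1", {"position": "first author"}),
    Role(person, "job-hunter", "search-2026", {"target": "research engineer"}),
    Role(person, "company-contact", "company-1", {"title": "hiring manager"}),
]

# Dropping a role leaves the person untouched.
roles = [r for r in roles if r.kind != "job-hunter"]
```

The person object never gains job-search or authorship fields; those live and die with the role.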
07 · Schema Evolution
The notebook learns from experience.

When the agent encounters something the current schema cannot represent, it records the gap as a note and files a GitHub issue. The coding agent then extends the schema; GLAV mapping rules migrate prior data; curation continues.
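
The loop can be sketched in Python under simplifying assumptions: the schema is a dict of allowed attributes per type, a gap is a string appended to a list, and a default-filling loop stands in for a GLAV mapping rule. Filing the GitHub issue is omitted, and all names are hypothetical.

```python
schema = {"paper": {"title", "year"}}
gap_notes: list[str] = []


def store(kind: str, record: dict, db: list) -> bool:
    """Store a record if the schema can represent it; otherwise log the gap."""
    allowed = schema.get(kind)
    if allowed is None or not set(record) <= allowed:
        missing = sorted(set(record) - (allowed or set()))
        gap_notes.append(f"schema gap: {kind} lacks {missing}")
        return False
    db.append({"kind": kind, **record})
    return True


db: list[dict] = []
store("paper", {"title": "T", "year": "2026"}, db)
ok_before = store("paper", {"title": "U", "venue": "CAIS"}, db)  # not representable yet

# Schema extension (the coding agent's step in the loop above).
schema["paper"].add("venue")

# Stand-in for a migration rule: give prior records a default for the new attribute.
for row in db:
    if row["kind"] == "paper":
        row.setdefault("venue", "unknown")

ok_after = store("paper", {"title": "U", "venue": "CAIS"}, db)
```

The rejected record is not lost: the gap note is what triggers the extension, and the same record is accepted once the schema has caught up.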

Schema evolution loop
08 · Try it at the booth
Clone the repo, build a skill.
1 "Develop a job search for a fictitious person."
2 "Run a tech recon on a research question of interest."
3 "Build a novel curation skill that serves a need."
Code: GitHub repo
Paper: doi.org / ACM DL
Quick-start: Wiki & setup